Abstract. This is an announcement of the proof of the inverse conjecture
for the Gowers $U^{s+1}[N]$-norm for all $s \ge 3$; this is new for $s \ge 4$, the cases
$s = 1, 2, 3$ having been previously established. More precisely, we outline a
proof that if $f : [N] \to [-1,1]$ is a function with $\|f\|_{U^{s+1}[N]} \ge \delta$ then there
is a bounded-complexity $s$-step nilsequence $F(g(n)\Gamma)$ which correlates with $f$,
where the bounds on the complexity and correlation depend only on $s$ and
$\delta$. From previous results, this conjecture implies the Hardy-Littlewood prime
tuples conjecture for any linear system of finite complexity. In particular, one
obtains an asymptotic formula for the number of $k$-term arithmetic progressions
$p_1 < p_2 < \dots < p_k \le N$ of primes, for every $k \ge 3$.



1. Introduction 

This is an announcement and summary of our much longer paper [20], the purpose
of which is to establish the general case of the Inverse Conjecture for the
Gowers norms, conjectured by the first two authors in [15, Conjecture 8.3]. If $N$
is a (typically large) positive integer then we write $[N] := \{1, \dots, N\}$. Throughout
the paper we write $\mathcal{D} := \{z \in \mathbb{C} : |z| \le 1\}$. For each integer $s \ge 1$ the inverse
conjecture GI($s$), whose statement we recall shortly, describes the structure of functions
$f : [N] \to \mathcal{D}$ whose $(s+1)^{\mathrm{st}}$ Gowers norm $\|f\|_{U^{s+1}[N]}$ is large. These conjectures
together with a good deal of motivation and background to them are discussed in
[HI HU US]. The conjectures GI(1) and GI(2) have been known for some time, the
former being a straightforward application of Fourier analysis, and the latter being
the main result of [14] (see also [30] for the characteristic 2 analogue). The case
GI(3) was also recently established by the authors in [19]. In this note we announce
the resolution of the remaining cases GI($s$) for $s \ge 3$, in particular reproving the
results in [19].

We begin by recalling the definition of the Gowers norms. If $G$ is a finite abelian
group, $d \ge 1$ is an integer, and $f : G \to \mathbb{C}$ is a function then we define

\[ \|f\|_{U^d(G)} := \bigl( \mathbb{E}_{x, h_1, \dots, h_d \in G} \Delta_{h_1} \cdots \Delta_{h_d} f(x) \bigr)^{1/2^d}, \tag{1.1} \]

where $\Delta_h f$ is the multiplicative derivative

\[ \Delta_h f(x) := f(x+h)\overline{f(x)} \]

and $\mathbb{E}_{x \in X} f(x) := \frac{1}{|X|} \sum_{x \in X} f(x)$ denotes the average of a function $f : X \to \mathbb{C}$ on
a finite set $X$. Thus for instance we have

\[ \|f\|_{U^2(G)} := \bigl( \mathbb{E}_{x, h_1, h_2 \in G} f(x)\overline{f(x+h_1)}\,\overline{f(x+h_2)} f(x+h_1+h_2) \bigr)^{1/4}. \]

One can show that $U^d(G)$ is indeed a norm on the functions $f : G \to \mathbb{C}$ for any
$d \ge 2$, though we will not need this fact here.

In this paper we will be concerned with functions on $[N]$, which is not quite a
group. To define the Gowers norms of a function $f : [N] \to \mathbb{C}$, set $G := \mathbb{Z}/\tilde{N}\mathbb{Z}$
for some integer $\tilde{N} \ge 2^d N$, define a function $\tilde{f} : G \to \mathbb{C}$ by $\tilde{f}(x) = f(x)$ for
$x = 1, \dots, N$ and $\tilde{f}(x) = 0$ otherwise, and set

\[ \|f\|_{U^d[N]} := \|\tilde{f}\|_{U^d(G)} / \|1_{[N]}\|_{U^d(G)}, \]

where $1_{[N]}$ is the indicator function of $[N]$. It is easy to see that this definition
is independent of the choice of $\tilde{N}$. One could take $\tilde{N} := 2^d N$ for definiteness if
desired.
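The definition of $\|f\|_{U^d[N]}$ can be tested numerically on very small examples. The following is a minimal brute-force sketch, not taken from the paper: the function names `gowers_power_cyclic` and `gowers_norm_interval` are hypothetical, and the choice $\tilde{N} = 2^d N$ follows the remark above. The cost grows like $\tilde{N}^{d+1}$, so it is only meant for toy values of $N$ and $d$.

```python
import cmath
from itertools import product


def e(x):
    """Standard character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def gowers_power_cyclic(F, d):
    """Return ||F||_{U^d(Z/MZ)}^(2^d) for a list F of length M, by brute force."""
    M = len(F)
    total = 0.0
    for hs in product(range(M), repeat=d):
        g = list(F)
        for h in hs:
            # Multiplicative derivative: (Delta_h g)(x) = g(x + h) * conj(g(x)).
            g = [g[(x + h) % M] * g[x].conjugate() for x in range(M)]
        total += sum(g).real
    # Average over x and h_1, ..., h_d (the imaginary parts cancel in the sum).
    return total / M ** (d + 1)


def gowers_norm_interval(f, N, d):
    """||f||_{U^d[N]} for f defined on 1..N, embedding [N] in Z/(2^d N)Z."""
    M = 2 ** d * N
    F = [0j] * M
    indicator = [0j] * M
    for n in range(1, N + 1):
        F[n] = complex(f(n))
        indicator[n] = 1 + 0j
    numerator = max(gowers_power_cyclic(F, d), 0.0)
    denominator = gowers_power_cyclic(indicator, d)
    return (numerator / denominator) ** (1.0 / 2 ** d)
```

For instance, a linear phase $n \mapsto e(\alpha n)$ should have $U^2[N]$-norm equal to 1, and a quadratic phase $n \mapsto e(\alpha n^2)$ should have $U^3[N]$-norm equal to 1 but $U^2[N]$-norm strictly less than 1 for generic $\alpha$.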

The Inverse conjecture for the Gowers $U^{s+1}[N]$-norm, abbreviated as GI($s$),
posits an answer to the following question.

Question 1.1. Suppose that $f : [N] \to \mathcal{D}$ is a function and let $\delta > 0$ be a positive
real number. What can be said if $\|f\|_{U^{s+1}[N]} \ge \delta$?

Note that in the extreme case $\delta = 1$ one can easily show that $f$ is a phase
polynomial, namely $f(n) = e(P(n))$ for some polynomial $P$ of degree at most $s$.
Furthermore, if $f$ correlates with a phase polynomial, that is to say if
$|\mathbb{E}_{n \in [N]} f(n)\overline{e(P(n))}| \ge \delta$, then it is easy to show that
$\|f\|_{U^{s+1}[N]} \ge c(\delta)$. It is natural to ask whether the
converse is also true: does a large Gowers norm imply correlation with a polynomial
phase function? Surprisingly, the answer is no, as was observed by Gowers [10]
and, in the related context of multiple recurrence, somewhat earlier by Furstenberg
and Weiss [9]. The work of Furstenberg and Weiss draws attention to the role of
homogeneous spaces $G/\Gamma$ of nilpotent Lie groups, and subsequent work of Host and
Kra [24] provides a link, in an ergodic-theoretic context, between these spaces and
certain seminorms with a formal similarity to the Gowers norms under discussion
here. Later work of Bergelson, Host and Kra [3] highlights the role of a class of
functions arising from these spaces $G/\Gamma$ called nilsequences. The inverse conjecture
for the Gowers norms, first formulated precisely in §8 of [15], postulates that this
class of functions (which contains the polynomial phases) represents the full set of
obstructions to having large Gowers norm.

Here is that precise formulation of the conjecture. Recall that an $s$-step
nilmanifold is a manifold of the form $G/\Gamma$, where $G$ is a connected, simply-connected
nilpotent Lie group of step at most $s$ (i.e. all $(s+1)$-fold commutators of $G$ are
trivial), and $\Gamma$ is a lattice (a discrete cocompact subgroup of $G$).

Conjecture 1.2 (GI($s$)). Let $s \ge 0$ be an integer, and let $0 < \delta \le 1$. Then there
exists a finite collection $\mathcal{M}_{s,\delta}$ of $s$-step nilmanifolds $G/\Gamma$, each equipped with some
smooth Riemannian metric $d_{G/\Gamma}$, as well as constants $C(s,\delta), c(s,\delta) > 0$ with the
following property. Whenever $N \ge 1$ and $f : [N] \to \mathcal{D}$ is a function such that
$\|f\|_{U^{s+1}[N]} \ge \delta$, there exists a nilmanifold $G/\Gamma \in \mathcal{M}_{s,\delta}$, some $g \in G$ and a function
$F : G/\Gamma \to \mathcal{D}$ with Lipschitz constant at most $C(s,\delta)$ with respect to the metric
$d_{G/\Gamma}$, such that

\[ |\mathbb{E}_{n \in [N]} f(n)\overline{F(g^n x)}| \ge c(s,\delta). \]

Let us briefly review the known partial results on this conjecture (in no particular 
order) : 

(i) GI(0) is trivial. 

(ii) GI(1) follows from a short Fourier-analytic computation.

(iii) GI(2) was established five years ago in [14], building on work of Gowers
[10].

(iv) GI(3) was established, quite recently, in [19].

(v) In the extreme case $\delta = 1$ one can easily show that $f(n) = e(P(n))$ for
some polynomial $P$ of degree at most $s$, and every such function is an
$s$-step nilsequence by a direct construction. See, for example, [14] for the
case $s = 2$.

(vi) In the almost extremal case $\delta \ge 1 - \varepsilon_s$, for some $\varepsilon_s > 0$, one may see that
$f$ correlates with a phase $e(P(n))$ by adapting arguments first used in the
theoretical computer-science literature pQ.

(vii) The analogue of GI($s$) in ergodic theory (which, roughly speaking,
corresponds to the asymptotic limit $N \to \infty$ of the theory here; see [25] for
further discussion) was formulated and established in [24], work done
independently of the work of Gowers (see also the earlier paper [13]). This work
was the first place in the literature to link objects of Gowers-norm type
(associated to functions on a measure-preserving system $(X, T, \mu)$) with
flows on nilmanifolds, and the subsequent paper [3] was the first work to
underline the importance of nilsequences. The formulation of GI($s$) by
the first two authors in [15] was very strongly influenced by these works.
For the closely related problem of analysing multiple ergodic averages, the
relevance of flows on nilmanifolds was earlier pointed out in [8, 9, 28],
building upon earlier work in [5]. See also [22, 36] for related work on
multiple averages and nilmanifolds in ergodic theory.

(viii) The analogue of GI(s) in finite fields of large characteristic was established 
by ergodic-theoretic methods in [U [33] . 

(ix) A weaker "local" version of the inverse theorem (in which correlation takes
place on a subprogression of $[N]$ of size $\sim N^{c_s}$) was established by Gowers
[11]. This paper provided a good deal of inspiration for our work here.

(x) The converse statement to GI($s$), namely that correlation with a function
of the form $n \mapsto F(g^n x)$ implies that $f$ has large $U^{s+1}[N]$-norm, is also
known. This was first established in [14, Proposition 12.6], following
arguments of Host and Kra [24] rather closely. A rather simple proof of this
result is given in [19, Appendix G].

The aim of this announcement is to outline an argument for the general case. 
Details may be found in the much longer paper [20] . 

Theorem 1.3. For any $s \ge 3$, the inverse conjecture for the $U^{s+1}[N]$-norm,
GI($s$), is true.

By combining this result with the previous results in [15, 17] we obtain a
quantitative Hardy-Littlewood prime tuples conjecture for all linear systems of finite
complexity; in particular, we now have the expected asymptotic for the number of
primes $p_1 < \dots < p_k \le N$ in arithmetic progression, for every fixed positive integer
$k$. We refer to [TS] for further discussion, as we have nothing new to add here
regarding these applications. Several further applications of the GI($s$) conjectures
are given in [5] [TS].

We remark that an alternative strategy towards the inverse conjecture and
related problems is currently being developed by Balazs Szegedy in an ongoing series
of papers [31, 32, 33]. There are some similarities in method between these papers
and ours, the most obvious being a reliance on nonstandard analysis to make the
algebraic manipulations easier. In other respects the methods of Szegedy are closer
to the ergodic theory methods of Host and Kra [24], whereas ours are ultimately
based on the Fourier-analytic methods of Gowers [10, 11].

In order to avoid some notational and technical difficulties, the presentation in 
this announcement will be non-rigorous, focusing on various model special cases and 
ignoring some fine distinctions. We will indicate these non-rigorous simplifications 
throughout this paper as "cheats".

Acknowledgements. BG was, for some of the period during which this work was 
carried out, a fellow of the Radcliffe Institute at Harvard. He is very grateful to 
the Radcliffe Institute for providing excellent working conditions. TT is supported 
by NSF Research Award DMS-0649473, the NSF Waterman award and a grant 
from the MacArthur Foundation. TZ is supported by ISF grant 557/08, an Alon 
fellowship and a Landau fellowship of the Taub foundation. All three authors are 
very grateful to the University of Verona for allowing them to use classrooms at 
Canazei during a week in July 2009. This work was largely completed during that 
week. 

2. Reduction to an integration problem 

Our proof of GI($s$) follows the strategy used to establish the $s = 2$ case in [14] and
the $s = 3$ case in [19], these methods in turn being based on the earlier arguments
of Gowers [10, 11]. In each case, one uses GI($s-1$) as an induction hypothesis
to assist in proving GI($s$). To pass from GI($s-1$) to GI($s$), one has to perform a
"cohomological" task, namely that of showing that a certain "cocycle" is essentially
a "coboundary" (or showing that a certain "closed" form is essentially "exact").
This cohomological task is by far the most difficult portion of the argument, and
will be discussed in more detail in later sections. We focus for now on the reduction
to that goal.

Cheat 2.1. It will be convenient to suppress dependence on parameters such as $\delta$,
and instead use asymptotic notation such as $\ll$ or $O(1)$ liberally. In the full paper
[20], we will in fact use the language of nonstandard analysis to systematically
suppress all of these parameters and make asymptotic notation such as this rigorous.
Here, however, we will avoid the use of this language and instead rely on more
informal terminology such as "bounded" or "large".

Fix a positive integer $s \ge 3$ and assume GI($s-1$) as an induction hypothesis.
Our goal, of course, is to prove GI($s$). Suppose then that we have a function
$f : [N] \to \mathcal{D}$ with $\|f\|_{U^{s+1}[N]} \gg 1$; our aim is to show that $f$ correlates with some
nilsequence $\chi(n)$ of step $s$ in the sense that

\[ |\mathbb{E}_{n \in [N]} f(n)\overline{\chi(n)}| \gg 1. \]

Here $\chi(n)$ is a function of the form $F(g^n x)$, where $F$ is a Lipschitz function with
bounded Lipschitz norm on an $s$-step nilmanifold $G/\Gamma$ chosen from a bounded list
of possibilities, $g \in G$, and $x \in G/\Gamma$. A simple example of an $s$-step nilsequence to
keep in mind for now is $\chi(n) = e(\alpha n^s)$, where $\alpha \in \mathbb{R}/\mathbb{Z}$ and $e(x) := e^{2\pi i x}$ is the
standard character. We caution however that this is not an especially representative
example. Further examples will be discussed later on.

Using the identity

\[ \|f\|_{U^{s+1}(\mathbb{Z}/\tilde{N}\mathbb{Z})} = \bigl( \mathbb{E}_{h \in \mathbb{Z}/\tilde{N}\mathbb{Z}} \|\Delta_h f\|_{U^s(\mathbb{Z}/\tilde{N}\mathbb{Z})}^{2^s} \bigr)^{1/2^{s+1}} \tag{2.1} \]

(extending $f$ by zero outside of $[N]$) it is a simple matter to conclude that

\[ \|\Delta_h f\|_{U^s[N]} \gg 1 \]

for many $h \in [-N, N]$, by which we mean for all $h$ in a subset $H \subset [-N, N]$
with $|H| \gg N$. Applying the hypothesis GI($s-1$), we conclude that for many
$h \in [-N, N]$, there exists a nilsequence $\chi_h$ of step $s-1$ which correlates with $\Delta_h f$,
that is to say

\[ |\mathbb{E}_{n \in [N]} \Delta_h f(n)\overline{\chi_h(n)}| \gg 1. \tag{2.2} \]

Our goal is to show that $f$ correlates with an $s$-step nilsequence $\theta$. Heuristically,
then, we expect the $(s-1)$-step nilsequences $\chi_h$ to behave like a derivative $\Delta_h \theta$
of such a nilsequence. Suppose that we are in a situation where the $\chi_h$ do indeed
"behave like" $\Delta_h \theta$ in an ostensibly rather weak way, namely

\[ \chi_h = \Delta_h \theta \cdot \psi_h \tag{2.3} \]

where the $\psi_h$ are "lower-order" $(s-2)$-step nilsequences. Then we can rewrite (2.2)
as

\[ |\mathbb{E}_{n \in [N]} \Delta_h(f\overline{\theta})(n)\overline{\psi_h(n)}| \gg 1. \]

Using the converse to GI($s-2$) (see e.g. [19, Appendix G]), we conclude that

\[ \|\Delta_h(f\overline{\theta})\|_{U^{s-1}[N]} \gg 1 \]

for many $h \in [-N, N]$. Using (2.1) (with $s-1$ in place of $s$), we conclude that

\[ \|f\overline{\theta}\|_{U^s[N]} \gg 1. \]

By a further appeal to the inductive hypothesis GI($s-1$), we have

\[ |\mathbb{E}_{n \in [N]} f(n)\overline{\theta(n)}\,\overline{\psi(n)}| \gg 1 \]

for some $(s-1)$-step nilsequence $\psi$. Since $\theta\psi$ is an $s$-step nilsequence, we obtain
the claim.
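The rewriting of (2.2) in terms of $f\overline{\theta}$ is a one-line computation, spelled out here for convenience. It uses only the representation (2.3) and the fact that the multiplicative derivative respects products and complex conjugation.

```latex
\begin{align*}
\Delta_h f(n)\,\overline{\chi_h(n)}
  &= f(n+h)\overline{f(n)}\;\overline{\theta(n+h)}\,\theta(n)\;\overline{\psi_h(n)} \\
  &= (f\overline{\theta})(n+h)\,\overline{(f\overline{\theta})(n)}\;\overline{\psi_h(n)} \\
  &= \Delta_h(f\overline{\theta})(n)\,\overline{\psi_h(n)}.
\end{align*}
```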

We may thus formulate our "cohomological" task more precisely: we must show
that $h \mapsto \chi_h$ is a "coboundary" in the sense that (2.3) holds for many $h$, and some
$s$-step nilsequence $\theta$ and $(s-2)$-step nilsequences $\psi_h$.

Cheat 2.2. Actually, this is an oversimplification in a number of minor ways. For
instance, it is convenient to allow the two factors of $\theta$ that appear in
$\Delta_h \theta(n) = \theta(n+h)\overline{\theta(n)}$ in (2.3) to be distinct. In other words, we have a representation

\[ \chi_h(n) = \theta(n+h)\overline{\theta'(n)}\psi_h(n) \tag{2.4} \]

for some nilsequences $\theta, \theta'$ of degree $s$. The above arguments can be adapted to
this case by using the Cauchy-Schwarz-Gowers inequality (see [11]) to decouple $\theta$
and $\theta'$. Secondly, for technical reasons having to do with a topological obstruction
that we will discuss in the next section, the nilsequences here will be vector- rather
than scalar-valued. Finally, in the actual proof, one needs to modify $\chi_h$ at various
stages of the argument to a slightly different nilsequence $\chi'_h$ which still correlates
with $\Delta_h f$, and so (2.3) would apply to the nilsequences $\chi'_h$ rather than $\chi_h$.

To keep the exposition simple (at the expense of strict accuracy), we will ignore
these details and pretend that our goal is to establish a representation of the form
(2.3).

3. Nilcharacters

Our arguments are geared towards the case $s \ge 3$, but let us temporarily consider
the $s = 2$ case as motivation. In that case, the $\chi_h$ are 1-step nilsequences. It
is not difficult to see that such sequences take the form $\chi_h(n) = F(\xi_h n)$ where
$F : (\mathbb{R}/\mathbb{Z})^d \to \mathbb{C}$ is a Lipschitz function on a torus of bounded dimension $d = O(1)$,
and $\xi_h \in \mathbb{R}^d$ is a vector-valued frequency. These sequences were obtained from the
hypothesis GI(1), which asserts that functions of large $U^2$-norm correlate with a
1-step nilsequence.

The space of 1-step nilsequences is in some sense "generated" by a special type of
1-step nilsequence, namely the Fourier characters $n \mapsto e(\xi n)$ where $\xi \in \mathbb{R}$ is some
frequency. Indeed, from Fourier analysis or the Stone-Weierstrass theorem it is easy
to see that every 1-step nilsequence can be approximated uniformly to arbitrary
accuracy by a bounded linear combination of Fourier characters. In particular,
GI(1) implies that functions of large $U^2$-norm correlate with a Fourier character.

Fourier characters have several additional pleasant properties inside the space of
1-step nilsequences. For instance, we have the following facts.

(i) They always have magnitude 1, and can therefore be inverted by their
conjugate: $e(\xi n)\overline{e(\xi n)} = 1$.

(ii) They form an abelian group under multiplication.

(iii) They are translation-invariant modulo lower order terms: for any $h$,
$e(\xi(n+h))$ and $e(\xi n)$ differ only by a constant depending on $h$ (i.e. a 0-step
nilsequence).

(iv) The mean $\mathbb{E}_{n \in [N]} e(\xi n)$ of a Fourier character is negligible unless the
frequency $\xi$ is extremely small (more precisely, if $\xi = O(1/N)$), in which case
the character $e(\xi n)$ is "essentially constant" (and thus essentially a 0-step
nilsequence).
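Property (iv) can be illustrated numerically. The sketch below is not from the paper: the helper names are hypothetical, and the quantitative bound used is the standard geometric-series estimate $|\mathbb{E}_{n \in [N]} e(\xi n)| \le 1/(2N\|\xi\|)$, where $\|\xi\|$ is the distance from $\xi$ to the nearest integer.

```python
import cmath


def e(x):
    """Standard character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def mean_character(xi, N):
    """Average of the Fourier character n -> e(xi * n) over n = 1, ..., N."""
    return sum(e(xi * n) for n in range(1, N + 1)) / N


def dist_to_integer(x):
    """Distance from x to the nearest integer."""
    return abs(x - round(x))


def geometric_bound(xi, N):
    """Standard upper bound 1 / (2 N ||xi||) for |mean_character(xi, N)|."""
    return 1.0 / (2 * N * dist_to_integer(xi))
```

With $N = 1000$, a frequency such as $\xi = 0.1234$ gives a mean of size about $10^{-3}$, whereas $\xi = 0.1/N$ gives a mean close to 1, matching the dichotomy described in (iv).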

For the more general argument, as in many other places [16, 19, 24], it is
convenient to define a notion of nilcharacter in such a way that analogues of the above
four properties are still satisfied.

For $s \ge 2$, an $s$-step nilmanifold $G/\Gamma$ is usually not a torus; however, it is always
a torus bundle over an $(s-1)$-step nilmanifold $G/G_s\Gamma$ with structure group equal to
the torus $G_s/\Gamma_s$, where $G = G_0 = G_1 \supseteq G_2 \supseteq \cdots \supseteq G_s \supseteq \{\mathrm{id}\}$ is the lower central
series of the $s$-step nilpotent group $G$, and $\Gamma_i := \Gamma \cap G_i$. An $s$-step nilcharacter is
then an $s$-step nilsequence $n \mapsto F(g^n x)$, where $g \in G$, $x \in G/\Gamma$, and $F$ is a Lipschitz
function with $|F| = 1$ pointwise and obeying the vertical frequency condition

\[ F(g_s x) = e(\xi(g_s))F(x) \tag{3.1} \]

for all $x \in G/\Gamma$, $g_s \in G_s$, and for some continuous homomorphism $\xi : G_s \to \mathbb{R}/\mathbb{Z}$
that annihilates $\Gamma_s$. The homomorphism $\xi$ is referred to as the vertical frequency
of the nilcharacter.

Cheat 3.1. For technical reasons it is convenient to generalise the concept of a
nilcharacter by replacing the lower central series $G = G_0 = G_1 \supseteq G_2 \supseteq \cdots$ by
a more general filtration $G = G_{(0)} \supseteq G_{(1)} \supseteq G_{(2)} \supseteq \cdots$ obeying the inclusion
$[G_{(i)}, G_{(j)}] \subseteq G_{(i+j)}$, and replacing the linear sequence $n \mapsto g^n x$ by a more
general polynomial sequence $n \mapsto g(n)$ adapted to this filtration. This generalisation
is needed in order to obtain a clean quantitative equidistribution theory for
nilsequences and nilcharacters, as explained in some detail in [16]. We will however
gloss over the distinction between linear sequences on nilmanifolds and polynomial
sequences on filtered nilmanifolds here.

A basic example of an $s$-step nilcharacter is a polynomial phase $n \mapsto e(P(n))$,
where $P : \mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ is a polynomial of degree at most $s$. An important
family of near-examples of nilcharacters comes from the more general class of bracket
polynomial phases, of which the bracket quadratic phase $n \mapsto e(\alpha n \lfloor \beta n \rfloor)$ for some
$\alpha, \beta \in \mathbb{R}$ (with $\lfloor \cdot \rfloor$ being the greatest integer function) is a simple model example.
This sequence can almost be expressed as a 2-step nilcharacter on the Heisenberg
nilmanifold, which is often presented using $3 \times 3$ matrices (see e.g. [21 [19]). Here
we present the same construction slightly more abstractly, since this will be helpful
later.

Consider, then, the free 2-step nilpotent Lie group $G$ generated by elements $e_1, e_2$
such that all commutators of order 3 or higher, such as $[e_1, [e_1, e_2]]$, are trivial. Here,
as is fairly standard in group theory, we write $[x, y] = x^{-1}y^{-1}xy$. A typical element
of $G$ has the form $(t_1, t_2, t_{12}) := e_1^{t_1} e_2^{t_2} [e_1, e_2]^{t_{12}}$, $t_1, t_2, t_{12} \in \mathbb{R}$, and multiplication
in these coordinates is given by

\[ (t_1, t_2, t_{12}) * (t'_1, t'_2, t'_{12}) = (t_1 + t'_1,\; t_2 + t'_2,\; t_{12} + t'_{12} + t_1 t'_2). \]

In particular we may identify the discrete subgroup $\Gamma$ consisting of those elements
with integer coordinates. Then $G/\Gamma$ is a nilmanifold and a given point with
coordinates $(t_1, t_2, t_{12})$ is equivalent under the right action of $\Gamma$ to the point

\[ (\{t_1\}, \{t_2\}, \{t_{12} - \lfloor t_2 \rfloor t_1\}). \]

This identifies those points of $G$ with coordinates satisfying $0 \le t_1, t_2, t_{12} \le 1$ as a
fundamental domain for the right action of $\Gamma$ on $G$.

One can easily calculate, for specific $g, x \in G$, coordinates for $g^n x$ in the
fundamental domain for $G/\Gamma$. In so doing one already sees objects such as $\alpha n \lfloor \beta n \rfloor$
making an appearance. These calculations are even easier if, instead, we look at
$g(n)\Gamma$ with $g(n) := e_1^{\alpha n} e_2^{\beta n}$, this being an example of a polynomial sequence on the
Heisenberg group $G$ (adapted to the lower central series filtration on $G$). We have

\[ g(n)\Gamma = (\{\alpha n\}, \{\beta n\}, \{-\lfloor \beta n \rfloor \alpha n\})\Gamma. \]

In particular we see that

\[ F(g(n)\Gamma) = e(\alpha n \lfloor \beta n \rfloor), \]

where $F : G/\Gamma \to \mathbb{C}$ is the function defined by

\[ F((x, y, z)\Gamma) := e(-z) \]

when $0 \le x, y, z < 1$.
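The Heisenberg computation above can be replayed mechanically. The following is a minimal sketch under stated assumptions: the group law is taken exactly as quoted in the coordinates $(t_1, t_2, t_{12})$, with cross term $t_1 t'_2$; the function names (`mul`, `reduce_mod_gamma`, and so on) are hypothetical; and $\alpha, \beta$ are taken rational so that exact arithmetic with `Fraction` avoids rounding issues in the floor function.

```python
import cmath
from fractions import Fraction
from math import floor


def e(x):
    """Standard character e(x) = exp(2*pi*i*x); x is reduced mod 1 first."""
    return cmath.exp(2j * cmath.pi * float(x % 1))


def mul(a, b):
    """Group law in the coordinates (t1, t2, t12), as quoted in the text."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2] + a[0] * b[1])


def reduce_mod_gamma(t):
    """Right-multiply by integer points to land in the fundamental domain."""
    # First kill the integer parts of t1 and t2 ...
    u = mul(t, (-floor(t[0]), -floor(t[1]), 0))
    # ... then use the central integer element (0, 0, -floor(u12)).
    return mul(u, (0, 0, -floor(u[2])))


def F(point):
    """The (discontinuous) function F((x, y, z) Gamma) = e(-z)."""
    return e(-point[2])


def g(n, alpha, beta):
    """Coordinates of the polynomial sequence g(n) = e1^(alpha n) e2^(beta n)."""
    return (alpha * n, beta * n, Fraction(0))
```

Evaluating `F(reduce_mod_gamma(g(n, alpha, beta)))` for rational `alpha`, `beta` should reproduce the bracket quadratic phase `e(alpha * n * floor(beta * n))`.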

Why, then, is this a near example of a nilsequence and not an actual example?
The answer lies in the function $F$, which is unfortunately discontinuous at
the edges of the fundamental domain. This is inevitable due to the twisted nature
of the torus bundle that forms the Heisenberg nilmanifold. However, if one allows
nilsequences to be vector-valued instead of scalar-valued, one can avoid this
topological obstruction. For instance, if $1 = \eta_1(x)^2 + \eta_2(x)^2$ is a partition of unity on
$\mathbb{R}/\mathbb{Z}$ with $\eta_1, \eta_2$ supported in $[0.1, 0.9]$ and $[-0.4, 0.4]$ (say) respectively, then the
vector-valued sequence

\[ n \mapsto \bigl( e(\alpha n \lfloor \beta n \rfloor)\,\eta_1(\beta n \bmod 1),\; e(\alpha n \lfloor \beta n - \tfrac{1}{2} \rfloor)\,\eta_2(\beta n \bmod 1) \bigr) \tag{3.2} \]

will be a nilsequence (taking values in the unit sphere $S^3$ of $\mathbb{C}^2$) associated to
the Heisenberg nilmanifold; the piecewise discontinuities of the greatest integer
part function have been avoided by use of the cutoffs $\eta_1, \eta_2$, making the relevant
function $F$ genuinely Lipschitz and not merely piecewise Lipschitz.

Cheat 3.2. As one can see, the vector-valued nilcharacters such as (3.2) are more
complicated than their scalar almost-nilcharacter counterparts such as $e(\alpha n \lfloor \beta n \rfloor)$.
To avoid some distracting notational complications, we will cheat by pretending that
sequences such as $e(\alpha n \lfloor \beta n \rfloor)$ are genuine nilcharacters. With this cheat, we can
pretend that all nilsequences involved are scalar-valued rather than vector-valued,
and we can use bracket polynomial phases as motivating examples of nilsequences.
For instance, with this cheat, $e(\alpha n^2)$ and $e(\alpha n \lfloor \beta n \rfloor)$ are 2-step nilcharacters,

\[ e(\alpha n^3),\quad e(\alpha n \lfloor \beta n^2 \rfloor),\quad e(\alpha n^2 \lfloor \beta n \rfloor),\quad e(\alpha n \lfloor \lfloor \beta n \rfloor \gamma n \rfloor),\quad e(\alpha n \lfloor \beta n \rfloor \lfloor \gamma n \rfloor) \]

are 3-step nilcharacters (for $\alpha, \beta, \gamma \in \mathbb{R}$), and so forth. Indeed, there is a sense in
which bracket polynomial phases are essentially the only examples of nilcharacters;
see [27] for further discussion (and [14] for a discussion of the 2-step case).

Nilcharacters enjoy analogues of the four useful properties mentioned earlier:

(i) They have magnitude 1 (and are thus essentially inverted by their complex
conjugations, if this statement is interpreted suitably in the vector-valued
case).

(ii) They (essentially) form an abelian group under multiplication (again using
a suitable interpretation of this statement in the vector-valued case, using
tensor products).

(iii) They are essentially translation-invariant modulo lower step errors, much
as a polynomial $P(n)$ of degree $s$ is translation-invariant modulo degree
$(s-1)$ errors. In particular, the derivative $\Delta_h \theta$ of an $s$-step nilcharacter $\theta$
is an $(s-1)$-step nilsequence.

(iv) The mean $\mathbb{E}_{n \in [N]} \chi(n)$ of a nilcharacter is negligible unless $\chi$ can be
represented as an $(s-1)$-step nilsequence. This property is a consequence of
the quantitative equidistribution theory of nilsequences [16].

By using Fourier analysis or the Stone-Weierstrass theorem much as in the 1-step
case, one can show that any $(s-1)$-step nilsequence can be approximated uniformly
to arbitrary accuracy by a bounded linear combination of $(s-1)$-step nilcharacters.
Because of this, we can assume without loss of generality that the $\chi_h$ in (2.2) are
$(s-1)$-step nilcharacters, rather than merely $(s-1)$-step nilsequences. That is, we
assume henceforth that

\[ |\mathbb{E}_{n \in [N]} \Delta_h f(n)\overline{\chi_h(n)}| \gg 1 \tag{3.3} \]

for many $h \in [-N, N]$, where the $\chi_h$ are $(s-1)$-step nilcharacters.

Remarks. The space of $(s-1)$-step nilcharacters, modulo $(s-2)$-step errors, is
denoted $\mathrm{Symb}^{s-1}(\mathbb{Z})$ in [20]; thus for instance $\mathrm{Symb}^1(\mathbb{Z}) = \hat{\mathbb{Z}} = \mathbb{R}/\mathbb{Z}$ is the
Pontryagin dual of $\mathbb{Z}$, and the abelian group $\mathrm{Symb}^{s-1}(\mathbb{Z})$ for higher $s$ can be viewed as
a higher order generalisation of the Pontryagin dual. As hinted in the above
discussion, there is a close relationship between nilcharacters and bracket polynomial
phases; some aspects of this relationship are explored in [27]. These two types of
object can be viewed as two different perspectives on the same concept, with the
nilcharacter perspective being superior for understanding equidistribution, and the
bracket polynomial perspective being superior for direct (albeit messy) algebraic
manipulation. (The equidistribution of bracket polynomials is studied directly in
[21], but it seems cleaner to study equidistribution via nilcharacters instead.)
Furthermore, bracket polynomials are a useful source of examples for building intuition.

In an early version of the full paper [20], the theory of both nilcharacters and
bracket polynomials, together with the connections between them, was extensively 
developed. Unfortunately this led to a significant increase in the length of the pa- 
per. The current version of the paper discards the theory of bracket polynomials, 
and works purely through the formalism of nilcharacters. This has shortened and 
simplified the paper considerably, albeit at the cost of making some of the alge- 
braic manipulations more abstract. In this announcement, we will rely on bracket 
polynomial examples for motivation. However, we will indicate at various junctures 
how various concepts concerning bracket polynomials may be translated into the 
nilcharacter framework. 

4. Approximate linearity 

We return to the problem of establishing a representation of $\chi_h$ that is roughly
of the form (2.3), that is to say

\[ \chi_h = \Delta_h \theta \cdot \psi_h \tag{4.1} \]

for some $s$-step nilsequence $\theta$ and some $(s-2)$-step "errors" $\psi_h$. As a consequence
of the discussion in the preceding section, we may assume that each $\chi_h$ is a
nilcharacter.

Suppose for the moment that $\chi_h(n)$ was in fact exactly equal to $\Delta_h \theta(n)$ for some
$s$-step nilcharacter $\theta$ for all $n, h \in \mathbb{Z}$. Then $\chi_h$ would necessarily obey the cocycle
equation

\[ \chi_{h+k}(n) = \chi_h(n)\chi_k(n+h) \tag{4.2} \]

for all $n, h, k \in \mathbb{Z}$.

In the converse direction, the cocycle equation (4.2) is a sufficient condition to
have a representation of the form $\chi_h = \Delta_h \theta$ for some function $\theta : \mathbb{Z} \to S^1$ (not
necessarily a nilcharacter). Indeed, one can simply set $\theta(n) := \chi_n(0)$, since (4.2)
then gives

\[ \theta(n+h) = \theta(n)\chi_h(n) \]

for all $n, h$. To put it another way, when one works in the category of all unit
magnitude sequences, rather than the category of nilcharacters, the first cohomology
group $H^1(\mathbb{Z}, S^1)$ of the integers is trivial.
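The "integration" step $\theta(n) := \chi_n(0)$ is simple enough to check by machine. In the sketch below, which is purely illustrative, the unit-modulus sequence `theta` is an arbitrary hypothetical choice and `integrate` is a hypothetical helper name; the point is only that a cocycle of the form $\chi_h = \Delta_h \theta$ obeys (4.2) and that the recipe recovers $\theta$ up to a constant phase.

```python
import cmath


def e(x):
    """Standard character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def theta(n):
    # Hypothetical unit-modulus sequence, chosen only for illustration.
    return e(2 ** 0.5 * n * n + 0.3 * n)


def chi(h, n):
    """chi_h(n) = Delta_h theta(n) = theta(n + h) * conj(theta(n))."""
    return theta(n + h) * theta(n).conjugate()


def integrate(cocycle):
    """Return phi with phi(n) := cocycle(n, 0), so that chi_h = Delta_h phi."""
    return lambda n: cocycle(n, 0)


phi = integrate(chi)
```

Here `phi(n)` equals `theta(n) * conj(theta(0))`, so the primitive is determined only up to a constant phase, as one expects of an integration procedure.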

These observations then suggest a strategy for obtaining the desired
representation (4.1) for the $(s-1)$-step nilcharacters $\chi_h$. One would first show that the
nilcharacters $\chi_h$ obey some property resembling the cocycle equation (4.2); then,
one would use that cocycle equation, together with the triviality of some sort of
"first cohomology group" of the integers, to "integrate" the cocycle and obtain
(4.1).

We begin with the first stage. The cocycle property (4.2) was, of course, deduced
from the assumption that $\chi_h = \Delta_h \theta$ exactly. We, however, are operating under
the much weaker assumption that $\chi_h$ merely correlates with $\Delta_h \theta$, for many $h$, up
to lower order terms. To handle this we use an application of the Cauchy-Schwarz
inequality due to Gowers [10]. The conclusion of this is as follows.

Lemma 4.1 (Approximate cocycle equation). Suppose that $f : [N] \to \mathcal{D}$ is a
function, and that for all $h$ in a dense subset $H \subset [-N, N]$ the derivative $\Delta_h f$
correlates with $\chi_h$ for some function $\chi_h : \mathbb{Z} \to \mathcal{D}$. Then for $\gg N^3$ additive quadruples
$h_1, h_2, h_3, h_4 \in H$ (that is, quadruples with $h_1 + h_2 = h_3 + h_4$) one has

\[ |\mathbb{E}_{n \in [N]} \chi_{h_1}(n)\chi_{h_2}(n + h_1 - h_4)\overline{\chi_{h_3}(n)}\,\overline{\chi_{h_4}(n + h_1 - h_4)}| \gg 1. \tag{4.3} \]

Proof. We may clearly replace $\chi_h(n)$ by $e(\theta_h)\chi_h(n)$, for any phases $\theta_h \in \mathbb{R}$. Choose
the $\theta_h$ in such a way that, once this replacement is made, $\mathbb{E}_n \Delta_h f(n)\overline{\chi_h(n)}$ is real
and positive. Taking expected values over $h$ and making the substitution $m := n + h$
gives

\[ \mathbb{E}_{m,n} f(m)\overline{f(n)}\,\overline{\chi_{m-n}(n)} \gg 1 \]

(with the convention that $\chi_h := 0$ for $h \notin H$).
Now apply the Cauchy-Schwarz inequality in the variables $m, n$ in turn to eliminate
the bounded quantities $f(m)$ and $f(n)$, obtaining

\[ |\mathbb{E}_{m,m',n,n'} \chi_{m-n}(n)\overline{\chi_{m'-n}(n)}\,\overline{\chi_{m-n'}(n')}\chi_{m'-n'}(n')| \gg 1. \]

This is equivalent to the stated result, as one may see upon substituting $m - n = h_1$,
$m - n' = h_4$, $m' - n = h_3$ and $m' - n' = h_2$. □

Remarks. To relate the above lemma to the preceding discussion of cocycle
equations, suppose that (4.2) is always satisfied. Then one may easily prove that

\[ \chi_{h_1}(n)\chi_{h_2}(n + h_1 - h_4)\overline{\chi_{h_3}(n)}\,\overline{\chi_{h_4}(n + h_1 - h_4)} = 1 \tag{4.4} \]

identically whenever $h_1 + h_2 = h_3 + h_4$, a statement which obviously bears
comparison to (4.3). Indeed, from (4.2) one has

\[ \chi_{h_1}(n) = \chi_{h_3}(n)\chi_{h_1 - h_3}(n + h_3) \]

and

\[ \chi_{h_4}(n + h_1 - h_4) = \chi_{h_2}(n + h_1 - h_4)\chi_{h_4 - h_2}(n + h_1 + h_2 - h_4), \]

while from the additive quadruple property one has

\[ \chi_{h_4 - h_2}(n + h_1 + h_2 - h_4) = \chi_{h_1 - h_3}(n + h_3). \]

Putting these together confirms (4.4). It is perhaps interesting to note that little
has been lost in passing from (4.2) to (4.4) (and so we may be confident that little
has been lost in asserting Lemma 4.1). Indeed, if (4.4) holds then applying it with
$(h_1, h_2, h_3, h_4) = (h + k, 0, h, k)$ gives

\[ \chi_{h+k}(n)\chi_0(n + h) = \chi_h(n)\chi_k(n + h). \]

This is almost (4.2). Setting $\theta(n) := \chi_n(0)$ and $\theta'(n) := \chi_n(0)\overline{\chi_0(n)}$ then gives

\[ \chi_h(n) = \theta(n + h)\overline{\theta'(n)}, \tag{4.5} \]

which is a variant of (2.4). Conversely, it is easy to verify that any $\chi_h$ of the
form (4.5) (with $\theta, \theta'$ having magnitude 1) obeys (4.4). This helps explain why our
arguments will end up concluding (2.4) rather than (4.1).
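The last two claims of the remark are easy to test numerically. The sketch below is illustrative only: the two unit-modulus sequences `theta` and `theta_prime` are arbitrary hypothetical choices, and the helper names are not from the paper. It builds $\chi_h$ in the decoupled form (4.5), evaluates the left-hand side of (4.4) on additive quadruples, and measures by how much the exact cocycle equation (4.2) fails.

```python
import cmath


def e(x):
    """Standard character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def theta(n):
    # Hypothetical unit-modulus sequence.
    return e(2 ** 0.5 * n * n)


def theta_prime(n):
    # A second, different hypothetical unit-modulus sequence.
    return e(3 ** 0.5 * n * n + 0.25)


def chi(h, n):
    """Decoupled form (4.5): chi_h(n) = theta(n + h) * conj(theta'(n))."""
    return theta(n + h) * theta_prime(n).conjugate()


def quadruple_product(h1, h2, h3, h4, n):
    """Left-hand side of (4.4); assumes h1 + h2 == h3 + h4."""
    shift = n + h1 - h4
    return (chi(h1, n) * chi(h2, shift)
            * chi(h3, n).conjugate() * chi(h4, shift).conjugate())


def cocycle_defect(h, k, n):
    """|chi_{h+k}(n) - chi_h(n) chi_k(n+h)|; zero iff (4.2) holds at (h, k, n)."""
    return abs(chi(h + k, n) - chi(h, n) * chi(k, n + h))
```

With these choices the quadruple product is identically 1 on additive quadruples, while `cocycle_defect` is generally nonzero, which illustrates why the conclusion one can hope for is (2.4) rather than (4.1).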

From the properties of nilcharacters mentioned in the previous section, an
immediate corollary of Lemma 4.1 is the following.

Corollary 4.2 (Top order approximate linearity). Let $f : [N] \to \mathcal{D}$ be a function,
and suppose that for all $h$ in a dense subset $H \subset [-N, N]$ the derivative $\Delta_h f$
correlates with an $(s-1)$-step nilcharacter $\chi_h$. Then for many additive quadruples
$h_1, h_2, h_3, h_4 \in H$ the $(s-1)$-step nilcharacter $\chi_{h_1}\chi_{h_2}\overline{\chi_{h_3}}\,\overline{\chi_{h_4}}$ is an $(s-2)$-step
nilsequence.



AN INVERSE THEOREM FOR THE GOWERS $U^{s+1}[N]$-NORM

This corollary asserts that the map $h \mapsto \chi_h$ is in some sense approximately
(affine-)linear to top order. Because it only controls the top order behaviour of $\chi_h$,
this corollary is strictly weaker than Lemma 4.1 and will turn out to be insufficient
by itself for the purposes of integrating $\chi_h$ in the sense of (4.1) (or (2.4)). Eventually
we need to return to Lemma 4.1 and study the lower-order (and more specifically,
the $(s-2)$-step) terms in more detail. Nevertheless, Corollary 4.2 is an important
partial result and it yields a crucial linearisation of the family of nilcharacters $\chi_h$.
We turn to the details of this now.

5. Linearisation 

We now take the approximate linearity relationship in Corollary 4.2 and see
what this implies about the family of nilcharacters $\chi_h$. As motivation, we begin
by discussing the $s = 2$ case, which was treated in [10] and developed further in
[14]. Here, the one-step nilcharacters $\chi_h$ take the form $\chi_h(n) = e(\xi_h n)$ for some
frequency $\xi_h \in \mathbb{R}/\mathbb{Z}$. Corollary 4.2 asserts in this case that the map $h \mapsto \xi_h$ is
approximately linear in the sense that

\[ \xi_{h_1} + \xi_{h_2} - \xi_{h_3} - \xi_{h_4} = O(1/N) \bmod 1 \tag{5.1} \]

for many additive quadruples $h_1 + h_2 = h_3 + h_4$.
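To see why Corollary 4.2 takes the form (5.1) when $s = 2$, one can unwind the definitions. The following is a sketch of that step, relying only on property (iv) of Fourier characters from Section 3, under which a character is essentially a 0-step nilsequence exactly when its frequency is $O(1/N)$.

```latex
\begin{align*}
\chi_{h_1}(n)\chi_{h_2}(n)\overline{\chi_{h_3}(n)}\,\overline{\chi_{h_4}(n)}
  &= e\bigl((\xi_{h_1} + \xi_{h_2} - \xi_{h_3} - \xi_{h_4})\,n\bigr), \\
\text{and this is essentially constant on } [N]
  &\iff \xi_{h_1} + \xi_{h_2} - \xi_{h_3} - \xi_{h_4} = O(1/N) \bmod 1.
\end{align*}
```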

This type of constraint was analysed in [10], using what is now called the
Balog-Szemerédi-Gowers lemma [2, 10], together with a version of Freiman's inverse
sumset theorem [7] due to Ruzsa [35]. As a consequence of these tools from additive
combinatorics and a little extra geometry of numbers, one can deduce from (5.1)
that the map $h \mapsto \xi_h$ is somewhat bracket-linear, in the sense that there exist real
numbers $\alpha_1, \dots, \alpha_m, \beta_1, \dots, \beta_m, \gamma$ for some $m = O(1)$ such that one has the relation

(5.2)    $\xi_h = \sum_{j=1}^m \alpha_j \lfloor \beta_j h \rfloor + \gamma + O(1/N) \mod 1$

for many values of $h$. See [T3] for further discussion and [T^l Appendix C] for a
guide to how to use the arguments of [2] to supply a proof of this exact claim,
which was not required there. In particular, we can approximate $\chi_h(n)$ (modulo
"lower order terms") by the expression

(5.3)    $\chi(h,n) := e(\gamma n) \prod_{j=1}^m e(\alpha_j n \lfloor \beta_j h \rfloor)$.
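
As a quick consistency check, the following sketch shows how the approximation (5.3) arises from (5.2). It assumes that (5.2) is read with integer-part brackets $\lfloor \beta_j h \rfloor$ and that $1 \le n \le N$; the symbol $\varepsilon_h$ for the error term is introduced here for illustration only.

```latex
% Assumption: for the h in question,
%   \xi_h = \sum_j \alpha_j \lfloor \beta_j h \rfloor + \gamma + \varepsilon_h  (mod 1),
% with \varepsilon_h = O(1/N).  Multiplying by the integer n preserves the congruence.
\begin{align*}
\chi_h(n) = e(\xi_h n)
  &= e(\gamma n) \prod_{j=1}^m e\bigl(\alpha_j n \lfloor \beta_j h \rfloor\bigr)
     \cdot e(\varepsilon_h n) \\
  &= \chi(h,n)\, e(\varepsilon_h n).
\end{align*}
% Since |\varepsilon_h n| = O(1) for n in [N], the factor e(\varepsilon_h n)
% varies by only a bounded amount across [N]; this is the kind of factor that
% "modulo lower order terms" is meant to discard.
```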

A new innovation in our longer paper to come is to view (5.3) as a (piecewise)
"bi-nilcharacter" of two variables $h, n$, which is of "bi-degree" $(1,1)$ in $h, n$. Informally,
this means that each bracket monomial that comprises the phase of $\chi(h,n)$ is of
degree at most 1 in $h$ and of degree at most 1 in $n$. Properly formalising this notion
of bi-degree involves setting up the notion of a polynomial sequence in quite general
filtered nilmanifolds; this will be done in the full paper [20] and we shall say little
more about it here. Rather, we shall limit ourselves to an illustrative example,
namely that of describing the sequence $(h,n) \mapsto e(\alpha n \lfloor \beta h \rfloor)$ as a (piecewise)
bi-nilcharacter of bi-degree $(1,1)$.

By almost exactly the same computation as in the earlier Heisenberg example, we see that

$e(\alpha n \lfloor \beta h \rfloor) = F(g(h,n)\Gamma)$,

where here we are working on the Heisenberg nilmanifold $G/\Gamma$, the function $F$ is
given by $F(x, y, z) = e(-z)$ as before, and now

$g(h,n) := e_1^{\alpha n} e_2^{\beta h}$.

Once again we must note that $F$ is not Lipschitz, but we shall imagine that
it is for the purposes of this discussion. Given this, the key feature that qualifies
$e(\alpha n \lfloor \beta h \rfloor)$ as a bi-nilcharacter is that the polynomial sequence $g(h,n)$ has bi-degree
$(1,1)$ in the variables $h, n$. What does this mean? If one introduces the partial
derivative operators

$\partial^h_a g(h,n) := g(h+a,n)\, g(h,n)^{-1}$

and

$\partial^n_b g(h,n) := g(h,n+b)\, g(h,n)^{-1}$,

then we can easily verify that $\partial^h_a \partial^h_{a'} g$ and $\partial^n_b \partial^n_{b'} g$ are trivial, that
$\partial^h_a \partial^n_b g$ or $\partial^n_b \partial^h_a g$ takes values in $G_2 = [G,G]$, and that any triple derivative
of $g$ is trivial. It is this package of properties that we refer to as being of bi-degree
$(1,1)$ in the $h, n$ variables. More generally, to define a bi-nilsequence of bi-degree
$(p,q)$, one needs to endow the nilpotent group $G$ with a two-parameter filtration
$(G_{(i,j)})_{i,j \geq 0}$ obeying the inclusions $G_{(i,j)} \supseteq G_{(i',j')}$ when $i' \geq i$, $j' \geq j$ and
$[G_{(i,j)}, G_{(k,l)}] \subseteq G_{(i+k,j+l)}$ for $i,j,k,l \geq 0$, and ask that the sequence $g(h,n)$ be such
that any mixed derivative involving $i$ differentiations in the $h$ variable and $j$
differentiations in the $n$ variable takes values in $G_{(i,j)}$. See [20] for details.
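
The bi-degree $(1,1)$ claim for the model sequence can be checked by hand. The following is a sketch, assuming $g(h,n) = e_1^{\alpha n} e_2^{\beta h}$ in the Heisenberg group (so that $G_2 = [G,G]$ is central), writing $\partial^h_a$ and $\partial^n_b$ for the two partial derivative operators (this superscript notation is an assumption made for readability), and writing $[x,y] := xyx^{-1}y^{-1}$.

```latex
\begin{align*}
\partial^n_b g(h,n) &= e_1^{\alpha(n+b)} e_2^{\beta h}\, e_2^{-\beta h} e_1^{-\alpha n}
                     = e_1^{\alpha b}, \\
\partial^h_a g(h,n) &= e_1^{\alpha n} e_2^{\beta(h+a)}\, e_2^{-\beta h} e_1^{-\alpha n}
                     = e_1^{\alpha n} e_2^{\beta a} e_1^{-\alpha n}.
\end{align*}
% \partial^n_b g depends on neither h nor n, hence
%   \partial^n_{b'} \partial^n_b g = id   and   \partial^h_a \partial^n_b g = id.
% \partial^h_a g does not depend on h, hence \partial^h_{a'} \partial^h_a g = id.
\begin{align*}
\partial^n_b \partial^h_a g(h,n)
  &= \bigl(e_1^{\alpha(n+b)} e_2^{\beta a} e_1^{-\alpha(n+b)}\bigr)
     \bigl(e_1^{\alpha n} e_2^{\beta a} e_1^{-\alpha n}\bigr)^{-1} \\
  &= e_1^{\alpha n}\, [e_1^{\alpha b}, e_2^{\beta a}]\, e_1^{-\alpha n}
   = [e_1^{\alpha b}, e_2^{\beta a}] \in G_2 .
\end{align*}
% The last expression is central and independent of (h,n), so every
% triple derivative of g is trivial.
```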

An example to keep in mind for a bi-nilcharacter of bi-degree $(p,q)$ is that of a
polynomial phase

(5.4)    $(h,n) \mapsto e\bigl(\sum_{i=0}^{p} \sum_{j=0}^{q} \alpha_{i,j} h^i n^j\bigr)$.

As with our earlier discussion of 1-variable nilsequences, this is not an especially
representative example and one also needs to model "bracket polynomial" behaviour.
To give a more complicated example than the one just discussed arising from the
Heisenberg nilmanifold,

$e(\alpha n \lfloor \beta h \lfloor \gamma n \rfloor \rfloor)$

is a (piecewise) bi-nilcharacter of bi-degree $(1,2)$ in $h, n$.

Now we turn to higher step analogues of the phenomena just discussed. 

Theorem 5.1 (Linearisation). Suppose that $f : [N] \to \mathbb{D}$ is a function such that for
many $h$ in $[-N,N]$ the multiplicative derivative $\Delta_h f$ correlates with an $(s-1)$-step
nilcharacter $\chi_h$. Then there exists a bi-nilcharacter $\chi(h,n)$ of bi-degree $(1, s-1)$
in $h, n$ and $(s-2)$-step nilsequences $\psi_h$ such that $\Delta_h f$ correlates with $\chi(h,\cdot)\psi_h$ for
many $h \in [-N,N]$.

Remark. Note that in the case $s = 2$ this is more-or-less precisely the outcome
of the discussion we had above, in which the phase $\xi_h$ was shown to vary
bracket-linearly and then exhibited as a bi-nilcharacter coming from the Heisenberg group.

We refer to this operation of replacing the family of one-dimensional
$(s-1)$-step nilcharacters $\chi_h(n)$ by a single "bi-nilcharacter" $\chi(h,n)$ of bi-degree $(1, s-1)$
in $h, n$ as linearisation. Establishing this property is difficult, and occupies the
bulk of [20]. The starting point for accomplishing this linearisation will be the
top-degree portion of the approximate cocycle equation, Lemma 4.1, or in other
words Corollary 4.2. In the converse direction, it is not difficult to show by an
algebraic computation that if $\chi(h,n)$ is a bi-nilcharacter of bi-degree $(1, s-1)$ in
$h, n$, then the one-dimensional nilcharacters $\chi_h(n) := \chi(h,n)$ obey the conclusion
of Corollary 4.2. The reader is invited to do this for the simple example of the
polynomial phase (5.4) with $(p,q) = (1, s-1)$.
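
For the polynomial phase (5.4) with $(p,q) = (1, s-1)$ the verification is short. The sketch below writes $P_0, P_1$ for the two polynomials in $n$ (names introduced here only for this computation) and uses the unshifted form of Corollary 4.2 as stated above.

```latex
% \chi(h,n) = e( P_0(n) + h P_1(n) ),   P_i(n) := \sum_{j=0}^{s-1} \alpha_{i,j} n^j .
\begin{align*}
\chi_{h_1}\chi_{h_2}\overline{\chi_{h_3}}\,\overline{\chi_{h_4}}(n)
  &= e\bigl( (1+1-1-1)\, P_0(n) + (h_1+h_2-h_3-h_4)\, P_1(n) \bigr) \\
  &= e(0) = 1 \qquad \text{whenever } h_1 + h_2 = h_3 + h_4 .
\end{align*}
% The constant function 1 is trivially an (s-2)-step nilsequence, so the
% conclusion of Corollary 4.2 holds for every additive quadruple in this example.
```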

To obtain linearisation from Corollary 4.2 for a general value of $s \geq 3$ requires
five additional ingredients.

(i) A secondary induction on the "rank" of the nilcharacters being linearised. 

(ii) A "sunflower decomposition" that regularises the frequencies involved into 
"petal" and "core" frequencies. Roughly speaking, the core frequencies do 
not depend on h whilst the petal frequencies vary in a highly independent 
fashion with h. 

(iii) A "Furstenberg-Weiss argument", based ultimately on the quantitative
equidistribution theory of nilsequences, that shows that every top order
term in a nilcharacter has at most one petal (genuinely $h$-dependent)
frequency.

(iv) A further application of the quantitative equidistribution theory of
nilsequences, together with additive combinatorics, to show that these petal
frequencies (may be assumed to) vary bracket-linearly.

(v) An algebraic construction to model these objects, which vary
bracket-linearly in $h$ and in a "nil-fashion" in $n$, by a bi-nilsequence $\chi(h,n)$ of
bi-degree $(1, s-1)$.

We now give a few further details for each of these (somewhat technical) ingre- 
dients in turn. 

(i) The notion of degree and rank. The need for an induction on rank first arose
in the $s = 3$ case of linearisation in [19], in which the (piecewise) nilcharacters $\chi_h$
took the form

$\chi_h(n) = e(\alpha_h n \lfloor \beta_h n \rfloor)\, e(\gamma_h n^2) \cdots$

where the $\cdots$ denote 1-step factors. It turned out that one had to first fully linearise
the "rank 2 quadratics" $\alpha_h n \lfloor \beta_h n \rfloor$ before one could then linearise the "rank 1
quadratics" $\gamma_h n^2$, because the process of linearising the former type of quadratic
tended to generate error terms that would have to be absorbed into the latter type
of quadratic. A typical example of such a manipulation arises from the identity

(5.5)    $e(\alpha n \lfloor \beta n \rfloor) = e(-\beta n \lfloor \alpha n \rfloor)\, e(\alpha\beta n^2)\, e(-\{\alpha n\}\{\beta n\})$

which equates the rank 2 quadratic $e(\alpha n \lfloor \beta n \rfloor)$ with the rank 2 quadratic $e(-\beta n \lfloor \alpha n \rfloor)$
modulo rank 1 quadratic and 1-step errors.

In the higher step case, one would like to similarly organise various components
of an $(s-1)$-step nilcharacter into components of different ranks. If one pretends
that a nilcharacter $\chi$ is built up of various bracket monomials of degree $(s-1)$,
times lower order terms, then one can heuristically think of the rank of each
monomial as the number of brackets involved in its definition, plus one. For instance,
$e(\alpha n \lfloor \beta n \lfloor \gamma n^2 \lfloor \delta n \rfloor \rfloor \rfloor)$ is a degree 5 bracket monomial with a rank of 4.

One can formalise the notion of rank using the calculus of bracket polynomials,
but the approach taken in [20] is to abstract away the bracket polynomials and
define rank purely within the formalism of nilcharacters. This is done by a device
similar (though not identical) to that used to define bi-nilcharacters of a given
bi-degree. Namely, to build an $(s-1)$-step nilcharacter $\chi$ of a given rank $r_0$, one
creates a two-dimensional filtration $G_{(d,r)}$ on a nilpotent group $G$ for every given
degree $d$ and rank $r$, with the nesting properties $G_{(d,r)} \supseteq G_{(d',r')}$ when $d' > d$
or $d' = d$ and $r' > r$, as well as the inclusions $[G_{(d,r)}, G_{(d',r')}] \subseteq G_{(d+d',r+r')}$ for
all $d, d', r, r' \geq 0$ and $G_{(d,0)} = G_{(d,1)}$, with the hypothesis that $G_{(s-1,r_0+1)}$ vanishes.
One then writes $\chi(n) = F(g(n)\Gamma)$ where $F$ obeys suitable Lipschitz and vertical
character properties, and $g$ is a polynomial sequence with the property that the
$i$-fold derivatives take values in $G_{(i,0)} = G_{(i,1)}$ for all $i \geq 0$. For details, see [20].
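
The bracket identity relating the two rank 2 quadratics $e(\alpha n \lfloor \beta n \rfloor)$ and $e(-\beta n \lfloor \alpha n \rfloor)$ in the discussion of rank can be verified directly. The following is a sketch of that computation, using only the decomposition of a real number into integer and fractional parts.

```latex
% Write \alpha n = \lfloor\alpha n\rfloor + \{\alpha n\}, and similarly for \beta n.
\begin{align*}
\alpha\beta n^2
  &= \lfloor\alpha n\rfloor\lfloor\beta n\rfloor
   + \lfloor\alpha n\rfloor\{\beta n\}
   + \{\alpha n\}\lfloor\beta n\rfloor
   + \{\alpha n\}\{\beta n\} \\
  &\equiv \beta n\lfloor\alpha n\rfloor + \alpha n\lfloor\beta n\rfloor
   + \{\alpha n\}\{\beta n\} \pmod 1 ,
\end{align*}
% using \lfloor\alpha n\rfloor\lfloor\beta n\rfloor \in \mathbb{Z},
%   \{\beta n\}\lfloor\alpha n\rfloor \equiv \beta n\lfloor\alpha n\rfloor  (mod 1),
%   \{\alpha n\}\lfloor\beta n\rfloor \equiv \alpha n\lfloor\beta n\rfloor  (mod 1).
% Rearranging and applying e(\cdot):
\[
e(\alpha n\lfloor\beta n\rfloor)
  = e(-\beta n\lfloor\alpha n\rfloor)\, e(\alpha\beta n^2)\, e(-\{\alpha n\}\{\beta n\}) .
\]
```

Here $e(\alpha\beta n^2)$ is the rank 1 quadratic error, and $e(-\{\alpha n\}\{\beta n\})$ is the factor treated as a 1-step error.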

(ii) The sunflower decomposition. Suppose that we are dealing with the case
$s = 3$ and that, for the sake of exposition, we have $\chi_h(n) = e(\alpha_h n \lfloor \beta_h n \rfloor)$. At this
stage we have no information about how the frequencies $\alpha_h, \beta_h$ vary with $h$. It
may be that $\alpha_h$ is roughly constant in $h$ and that $\beta_h$ is highly oscillatory in $h$. If
this is the case we are actually quite happy, since then some understanding of the
distribution of $\chi_{h_1}\chi_{h_2}\overline{\chi_{h_3}}\,\overline{\chi_{h_4}}(n)$ as $h_1, h_2, h_3, h_4$ vary over additive quadruples is
possible. More bothersome is the possibility of behaviour that is a mix of these two
extremes, and the sunflower decomposition exists to rule this out.

Suppose that in some more general setting the set of frequencies of $\chi_h$ is some set
$\Xi_h$ of size $O(1)$. In the example just described we have $\Xi_h = \{\alpha_h, \beta_h\}$ but in
higher-step settings these frequencies might come from a host of bracket expressions such
as $e(\alpha_h n \lfloor \beta_h n \lfloor \gamma_h n \rfloor \rfloor)$ or $e(\alpha_h n \lfloor \beta_h n \rfloor \lfloor \gamma_h n \rfloor)$ or the product of several such terms.
The aim of the sunflower decomposition is to replace the sets $\Xi_h$ by new sets

(5.6)    $\tilde\Xi_h = \Xi_* \cup \Xi'_h$,

all these sets still having size $O(1)$. Every frequency in $\tilde\Xi_h$ is an $O(1)$-rational
combination of those in $\Xi_h$, up to a small error. The "core" set $\Xi_*$ consists of
frequencies which do not depend on $h$, whilst the "petal" sets $\Xi'_h$ depend on $h$ in
a very dissociated manner: for most triples $h_1, h_2, h_3$ the frequencies in the union
$\Xi_* \cup \Xi'_{h_1} \cup \Xi'_{h_2} \cup \Xi'_{h_3}$ do not approximately satisfy an $O(1)$-rational relation.

We shall say nothing about the proof of the sunflower decomposition here, other
than that it may be established by iterative refinement; if at some stage the
requirements are not met by (5.6), it is possible to add a new frequency to the core set
and reduce the size of many of the petal sets $\Xi'_h$. Slightly implicitly, this argument
may be read out of [HI Section 7], particularly Proposition 7.5.

Once the sunflower decomposition has been established some work is required
to express the original nilcharacter $\chi_h(n)$ in terms of objects involving the new
sets of frequencies $\tilde\Xi_h$. Recall that the original frequencies $\Xi_h$ are $O(1)$-rational
combinations of the $\tilde\Xi_h$, up to small errors. In our work on GI(3) this was done
explicitly using "bracket quadratic identities", the basic idea being that an object such
as $e(\alpha n_1 \lfloor \beta n_2 \rfloor)$ is multilinear up to lower-order terms. In the more general
paper to come, these issues are instead dealt with in a more abstract fashion, using
nilsequences.

(iii) The Furstenberg-Weiss argument. For simplicity let us suppose that $s = 3$
and imagine that, following step (ii), the top-order term of $\chi_h(n)$ is a product of
terms such as $e(\alpha_h n \lfloor \beta_h n \rfloor)$, where the frequencies $\alpha_h, \beta_h$ belong to frequency sets $\tilde\Xi_h$
which have been decomposed as $\Xi_* \cup \Xi'_h$ according to the sunflower decomposition.
The aim is to show that (after refining the set of $h$) we do not have $\alpha_h, \beta_h \in \Xi'_h$.
That is to say, there are no terms with more than one petal frequency. Put another
way, no more than one frequency in any bracket monomial genuinely depends on $h$.

The argument proceeds by studying the conclusion of Corollary 4.2 using an
argument of Furstenberg and Weiss. For simplicity, let us just discuss a model case
in which $s = 3$ and each $\chi_h$ is essentially of the form $\chi_h(n) = e(\alpha_h n \lfloor \beta_h n \rfloor)$. This
was already treated in detail in [19, Lemma 7.3].

Lemma 5.2 (Furstenberg-Weiss argument, model case). Suppose that for $ijk =
123, 124$, the six frequencies $\alpha_{h_i}, \beta_{h_i}, \alpha_{h_j}, \beta_{h_j}, \alpha_{h_k}, \beta_{h_k}$ are linearly independent in
the sense that there is no non-trivial linear combination of these six frequencies with
bounded integer coefficients that is equal to $O(1/N)$ modulo 1. Then
$\chi_{h_1}\chi_{h_2}\overline{\chi_{h_3}}\,\overline{\chi_{h_4}}$ has negligible mean, and more generally does not correlate with any
1-step nilsequence.

We remark that we will find ourselves in exactly this situation if there are many
$h$ such that $\chi_h(n)$ contains a petal-petal combination. The conclusion of this lemma
then contradicts Corollary 4.2.

Proof. (Sketch) For notational simplicity we just sketch the claim that the mean

(5.7)    $\mathbb{E}_{n \in [N]}\, \chi_{h_1}\chi_{h_2}\overline{\chi_{h_3}}\,\overline{\chi_{h_4}}(n)$

is negligible. We write each $\chi_{h_j}(n)$ as a nilcharacter

$\chi_{h_j}(n) = F_j\bigl(e_{j,1}^{\alpha_{h_j} n} e_{j,2}^{\beta_{h_j} n} \mod \Gamma_j\bigr)$

where $e_{j,1}, e_{j,2}$ generate copies $G_j$ of the Heisenberg group with corresponding
discrete subgroups $\Gamma_j$, and $F_j$ is a suitable function. The mean (5.7) is then controlled
by the equidistribution of the orbit

$\bigl(e_{j,1}^{\alpha_{h_j} n} e_{j,2}^{\beta_{h_j} n} \mod \Gamma_j\bigr)_{j=1}^{4}$

in a product $(G_1/\Gamma_1) \times \dots \times (G_4/\Gamma_4)$ of four Heisenberg nilmanifolds.

An application of a quantitative version of Leibman's theorem [26] on
equidistribution of polynomial orbits in nilmanifolds, established by the first two authors
in [TB], tells us (roughly speaking) that this orbit is equidistributed on a
subnilmanifold $H/\Gamma'$ of $(G_1/\Gamma_1) \times \dots \times (G_4/\Gamma_4)$, where $H$ is a closed subgroup of
$G_1 \times \dots \times G_4$; the mean (5.7) is then essentially the integral of the tensor product
$F_1 \otimes F_2 \otimes \overline{F_3} \otimes \overline{F_4}$ on this subnilmanifold. The linear independence of the frequencies
$\alpha_{h_i}, \beta_{h_i}, \alpha_{h_j}, \beta_{h_j}, \alpha_{h_k}, \beta_{h_k}$ for $ijk = 123$ can be used to show that the projection
from $H$ to $G_1 \times G_2 \times G_3$ is surjective; similarly, the same hypothesis for $ijk = 124$
can be used to show that the projection from $H$ to $G_1 \times G_2 \times G_4$ is surjective. Taking
commutators, one then concludes that $H$ contains $[G_1, G_1] \times \{\mathrm{id}\} \times \{\mathrm{id}\} \times \{\mathrm{id}\}$ as a
subgroup. From this and the non-trivial oscillation of $F_1$ we see that
$F_1 \otimes F_2 \otimes \overline{F_3} \otimes \overline{F_4}$ has mean zero on $H/\Gamma'$, and the claim follows. □

The above argument may be used to rule out the possibility that $\chi_h(n) =
e(\alpha_h n \lfloor \beta_h n \rfloor)$ with both $\alpha_h$ and $\beta_h$ being petal frequencies, since in this case
almost all additive quadruples $h_1 + h_2 = h_3 + h_4$ will satisfy the hypotheses of the
lemma, leading to a contradiction of Corollary 4.2. A very similar, but more
notationally intensive, argument may be used to rule out a more general possibility:
that $\chi_h(n)$, which could in general be a product of many terms like $e(\alpha_h n \lfloor \beta_h n \rfloor)$,
contains one such term with two petal frequencies.
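
The "taking commutators" step in the proof sketch of Lemma 5.2 can be spelled out as follows. This is a sketch under the two surjectivity statements quoted in the proof; the elements $u, v, u_4, v_3$ are names introduced here for illustration.

```latex
% Let x, y \in G_1 be arbitrary.
% Surjectivity of H \to G_1 \times G_2 \times G_3 gives some u_4 \in G_4 with
%   u := (x, \mathrm{id}, \mathrm{id}, u_4) \in H .
% Surjectivity of H \to G_1 \times G_2 \times G_4 gives some v_3 \in G_3 with
%   v := (y, \mathrm{id}, v_3, \mathrm{id}) \in H .
% Commutators in a product group are taken coordinatewise:
\[
[u,v] = \bigl([x,y],\ [\mathrm{id},\mathrm{id}],\ [\mathrm{id},v_3],\ [u_4,\mathrm{id}]\bigr)
      = \bigl([x,y],\ \mathrm{id},\ \mathrm{id},\ \mathrm{id}\bigr) \in H .
\]
% As x, y range over G_1 the elements [x,y] generate [G_1,G_1], so
% [G_1,G_1] \times \{\mathrm{id}\} \times \{\mathrm{id}\} \times \{\mathrm{id}\} \subseteq H .
```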
(iv) Additive Combinatorics. Let us persist with the model setting in which
$s = 3$ and $\chi_h(n)$ is a product of terms of the form $e(\alpha_h n \lfloor \beta_h n \rfloor)$. As a consequence
of part (iii), we may assume that in each such term only one of $\alpha_h, \beta_h$ genuinely
depends on $h$ (i.e. is a petal frequency), the other frequency being core. A simple
model to consider is that in which $\chi_h(n) = e(\alpha_h n \lfloor \beta n \rfloor)$.

We then re-examine Corollary 4.2 in the light of this new structural information
on $\chi_h(n)$. By a further argument of Furstenberg-Weiss type, very similar to
the above, one may show that $\alpha_h$ satisfies a relation of type (5.1). Applying the
same additive-combinatorial machinery (the Balog-Szemerédi-Gowers theorem and
Freiman's theorem) we may replace $\alpha_h$ by a bracket-linear object as in (5.2).
Details of this type of argument in the case of GI(3) may be found in [19, Section 8].

(v) Constructing a nilobject. We have, at this point, shown that the top-order
terms of $\chi_h(n)$ vary in a somewhat "rigid" or algebraic way: more specifically, the
$h$-dependence is bracket-linear. The remaining task in the "linearisation" part of
the argument is to identify these top-order terms as coming from a bi-nilsequence
$\chi(h,n)$. In previous works on the inverse conjectures, such as that of the first
two authors on the $U^3$-norm [14] and the authors' treatment of the $U^4$-norm [19],
this "nilobject" was constructed in a rather ad hoc manner. In the former paper
suitable products of Heisenberg nilmanifolds were exhibited, whilst in the latter
the free 3-step nilpotent group on a suitable number of generators was considered.
We also remark that, in both of these works, the nilobject was constructed at the
very last step of the argument, rather than prior to the symmetry argument (to be
discussed in the next section) as here. In our longer paper [20] we introduce a more
systematic construction based on a semidirect product. Rather than describe this
in any kind of generality we merely outline an example of the construction. Suppose
that $\alpha_h := \gamma\{\delta h\}$ and that we know, for fixed $h$, how to construct the nilcharacter
$\chi_h(n) = e(\alpha_h n \lfloor \beta n \rfloor)$. We do, of course, since it comes from a Heisenberg example;
however, the description that follows works in much greater generality. Then we
show how to realise $\chi_h(n)$ as a bi-nilsequence.

The reader might briefly recall, at this point, the construction of $\chi_h(n)$ as a
nilcharacter on the Heisenberg group as given in §3, namely

$\chi_h(n) = F(g_h(n)\Gamma)$

with $g_h(n) = e_1^{\alpha_h n} e_2^{\beta n}$. We note once more that $F$ is not Lipschitz, and so $\chi_h(n)$
is not quite a true nilcharacter, but we shall pretend that it is for the purposes of
this announcement.

We turn now to the interpretation of $\chi_h(n)$ as a bi-nilsequence in $h$ and $n$. The
first task is to identify a subgroup $G_{\mathrm{petal}}$ of the Heisenberg group $G$ representing
that part of $G$ that is "influenced by" the petal frequency $\alpha_h$. In our setting this is
very easy; simply take $G_{\mathrm{petal}}$ to be the subgroup of $G$ generated by $e_1$ and $[e_1, e_2]$.
Note that $G_{\mathrm{petal}}$ is abelian and normal in $G$. These features are quite general and
hinge on the fact that there is only one petal frequency in $\chi_h(n)$. Of course, it
was precisely to achieve this that we worked so hard in (iii) above. In particular $G$
acts on $G_{\mathrm{petal}}$ by conjugation and we may form the semidirect product $G \ltimes G_{\mathrm{petal}}$,
defining multiplication by

$(g, g_1) \cdot (g', g_1') = (g g', g_1^{g'} g_1')$,

where $a^b := b^{-1} a b$ denotes conjugation.

Now consider the action $\rho$ of $\mathbb{R}$ on $G \ltimes G_{\mathrm{petal}}$ defined by

$\rho(t)(g, g_1) := (g g_1^t, g_1)$.

We may form a further semidirect product

$\tilde G := \mathbb{R} \ltimes_\rho (G \ltimes G_{\mathrm{petal}})$,

in which the product operation is defined by

$(t, (g, g_1)) \cdot (t', (g', g_1')) = (t + t', \rho(t')(g, g_1) \cdot (g', g_1'))$.

$\tilde G$ is a Lie group; indeed, one easily verifies that it is 3-step nilpotent. Inside $\tilde G$ we
take the lattice

$\tilde\Gamma := \mathbb{Z} \ltimes_\rho (\Gamma \ltimes \Gamma_{\mathrm{petal}})$,

where $\Gamma_{\mathrm{petal}} := \Gamma \cap G_{\mathrm{petal}}$.

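We will construct $\chi_h(n)$ as a bi-nilsequence $\tilde F(\tilde g(h,n)\tilde\Gamma)$ for suitable
$\tilde F : \tilde G/\tilde\Gamma \to \mathbb{C}$ and an appropriate polynomial sequence $\tilde g : \mathbb{Z}^2 \to \tilde G$. For $\tilde g$ take

$\tilde g(h,n) := (0, (e_2^{\beta n}, e_1^{\gamma n})) \cdot (\delta h, (\mathrm{id}, \mathrm{id}))$

and observe that

$\tilde g(h,n)\tilde\Gamma = (0, (e_2^{\beta n}, e_1^{\gamma n})) \cdot (\delta h, (\mathrm{id}, \mathrm{id}))\tilde\Gamma$
$\qquad\qquad = (\{\delta h\}, (e_2^{\beta n} e_1^{\{\delta h\}\gamma n}, e_1^{\gamma n}))\tilde\Gamma$.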
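Finally, take $\tilde F : \tilde G/\tilde\Gamma \to \mathbb{C}$ to be the function defined by

$\tilde F((t, (g, g_1))\tilde\Gamma) := F(g\Gamma)$

whenever $0 \leq t < 1$ and $g$ lies in the fundamental domain of $G/\Gamma$. By exactly the
same computation as for the Heisenberg group we have

$\tilde F(\tilde g(h,n)\tilde\Gamma) = e(\gamma\{\delta h\} n \lfloor \beta n \rfloor) = \chi_h(n)$,

which is exactly what we wanted.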
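
The reduction of $\tilde g(h,n)$ modulo the lattice can be checked from the product rule alone. The sketch below assumes the action is $\rho(t)(g,g_1) = (g g_1^t, g_1)$ and the product on $\tilde G$ is $(t,(g,g_1)) \cdot (t',(g',g_1')) = (t+t', \rho(t')(g,g_1) \cdot (g',g_1'))$, and uses that powers of $e_1$ commute with one another.

```latex
% Product rule with (t,(g,g_1)) = (0,(e_2^{\beta n}, e_1^{\gamma n}))
% and (t',(g',g_1')) = (\delta h,(\mathrm{id},\mathrm{id})):
\begin{align*}
\tilde g(h,n)
  &= \bigl(\delta h,\ \rho(\delta h)(e_2^{\beta n}, e_1^{\gamma n})\bigr)
   = \bigl(\delta h,\ (e_2^{\beta n} e_1^{\delta h\,\gamma n},\ e_1^{\gamma n})\bigr).
\end{align*}
% Right-multiplying by the lattice element
% (-\lfloor\delta h\rfloor,(\mathrm{id},\mathrm{id})) applies \rho(-\lfloor\delta h\rfloor)
% and shifts the first coordinate into [0,1):
\begin{align*}
\tilde g(h,n)\cdot\bigl(-\lfloor\delta h\rfloor,(\mathrm{id},\mathrm{id})\bigr)
  &= \bigl(\{\delta h\},\ (e_2^{\beta n} e_1^{\delta h\,\gamma n}
       e_1^{-\lfloor\delta h\rfloor\gamma n},\ e_1^{\gamma n})\bigr) \\
  &= \bigl(\{\delta h\},\ (e_2^{\beta n} e_1^{\{\delta h\}\gamma n},\ e_1^{\gamma n})\bigr).
\end{align*}
```

The middle coordinate now carries the frequency $\alpha_h = \gamma\{\delta h\}$ on $e_1$, which is how the bracket-linear dependence on $h$ enters the Heisenberg computation.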

This completes the discussion of point (v) in the model case of a rather clean
and simple collection of nilcharacters $\chi_h(n)$ on the Heisenberg group. Even here,
we have omitted details: for example, one must carefully place a filtration on $\tilde G$
and confirm that the new bi-nilsequence $\chi(h,n)$ has the claimed bi-degree, namely
$(1,2)$ in this case (note, however, that we have not even properly defined bi-degree
in this announcement). The difficulties involved in doing this, and in generalising
the semidirect product construction just described, are largely notational.

With a brief discussion of each of the five points (i) to (v) now completed, we
have concluded our sketch proof of Theorem 5.1.

6. Symmetrisation 

We turn now to the final part of the argument. Let us begin with a summary
of our current position, which is the result of applying the observation (3.3) and
the rather substantial Theorem 5.1. Together, these tell us that if $f : [N] \to \mathbb{D}$ is
a function with large $U^{s+1}[N]$-norm then there is a bi-nilcharacter $\chi(h,n)$ of
bi-degree $(1, s-1)$ in $h, n$ such that for many $h \in [-N, N]$, $\Delta_h f$ correlates with $\chi(h, \cdot)$
modulo $(s-2)$-step errors. We would like to "integrate" $\chi(h,n)$ by expressing it
in the form

$\chi(h,n) = \Delta_h \Theta(n) \cdot \psi_h(n)$,

for some $s$-step nilcharacter $\Theta$ and some $(s-2)$-step nilcharacters $\psi_h$.

To see what is necessary to achieve this, let us proceed heuristically as at the
start of §4 and suppose that $\chi(h,n) = \Delta_h \Theta(n)$. Then we have

$\chi(h, n+k)\overline{\chi(h,n)} = \Delta_k \Delta_h \Theta(n) = \Delta_h \Delta_k \Theta(n) = \chi(k, n+h)\overline{\chi(k,n)}$.

This "symmetry" relation, which will certainly not be satisfied by an arbitrary
bi-nilcharacter $\chi(h,n)$, suggests that, even in our rather weaker setting, we must
obtain further information about $\chi$ before we can complete our task.
For instance, if one had

$\chi(h,n) \approx e(\alpha h \lfloor \beta n \rfloor)$

(where $\approx$ informally denotes equivalence up to lower order terms) then there does
not appear to be any reasonable candidate for the antiderivative $\Theta$, whereas if

$\chi(h,n) \approx e(\alpha h \lfloor \beta n \rfloor + \alpha n \lfloor \beta h \rfloor)$

then $\chi(h,n) = \Delta_h \Theta(n)$ up to lower order terms, where $\Theta(n) := e(\alpha n \lfloor \beta n \rfloor)$. The
obstruction here is analogous to the basic fact in de Rham cohomology that in order
for a 1-form $\omega$ to be exact (i.e. to be the derivative $\omega = df$ of a scalar function), it
is first necessary that it be closed (i.e. $d\omega = 0$).
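
The symmetric example can be checked by expanding the derivative of $\Theta$. The sketch below assumes the convention $\Delta_h \Theta(n) := \Theta(n+h)\overline{\Theta(n)}$ and introduces a carry term $\epsilon \in \{0,1\}$ (a name used here only for this computation).

```latex
% \Theta(n) := e(\alpha n \lfloor\beta n\rfloor).
% Write \lfloor\beta(n+h)\rfloor = \lfloor\beta n\rfloor + \lfloor\beta h\rfloor + \epsilon,
% where \epsilon = \epsilon(h,n) \in \{0,1\}.
\begin{align*}
\alpha(n+h)\lfloor\beta(n+h)\rfloor - \alpha n\lfloor\beta n\rfloor
  &= \alpha h\lfloor\beta n\rfloor + \alpha n\lfloor\beta h\rfloor
   + \alpha h\lfloor\beta h\rfloor + \epsilon\,\alpha(n+h).
\end{align*}
% The first two terms are the symmetric top-order part.  The term
% \alpha h \lfloor\beta h\rfloor does not depend on n, and \epsilon\,\alpha(n+h)
% is piecewise linear in n, so both are of lower order in n.
```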

The need for this symmetry, and the means for obtaining it, was first addressed
in P31 [3D] as part of the proof of GI(2). A somewhat different argument of this
nature later appeared in [19] as part of the proof of GI(3). In the former case,
this symmetry was obtained by a Cauchy-Schwarz argument that was similar to (but
subtly different from) the one used to establish Lemma 4.1. In the latter case, we
inspected the lower order terms of Lemma 4.1, and we do the same here. In our
present setting, this lemma implies that

(6.1)    $\mathbb{E}_{n \in [N]}\, \chi(h_1,n)\chi(h_2,n+h_1-h_4)\overline{\chi(h_3,n)}\,\overline{\chi(h_4,n+h_1-h_4)}\, J_{\mathrm{LO}}(\cdot) \gg 1$

for many additive quadruples $h_1 + h_2 = h_3 + h_4$. Here, and in everything that
follows, we use the symbol $J(\cdot)$ to denote "junk terms". Here, these are terms of
"lower order" (hence the subscript LO); later on $J$ will also be allowed to include
terms, depending only on some strict subset of the variables, that are destined to
be annihilated by applications of the Cauchy-Schwarz inequality. We will denote
these by a subscript CS.

Let us pause to recall the remarks immediately following the statement of Lemma
4.1 to the effect that very little was "lost" in proving that lemma. It should not,
therefore, come as a surprise that (6.1) is in principle enough to proceed; however,
actually making use of this observation is surprisingly tricky.

Let us specialise to the case $s = 4$ and for the sake of this discussion suppose
that $\chi(h,n) = e(T(h,n,n,n))$, where $T : [N]^4 \to \mathbb{R}/\mathbb{Z}$ is to be thought of as a
"bracket linear form" such as

$T(n_1, n_2, n_3, n_4) = \alpha n_1 \lfloor \beta n_2 \lfloor \gamma n_3 \lfloor \delta n_4 \rfloor \rfloor \rfloor$.

Since $T$ only appears in the expression $T(h,n,n,n)$ we may assume that $T$ is
already symmetric in the last three variables, by replacing $T(n_1, n_2, n_3, n_4)$ with

$\frac{1}{6} \sum_{\pi \in S_3} T(n_1, n_{\pi(2)}, n_{\pi(3)}, n_{\pi(4)})$.

We need only establish, then, some symmetry in the first two variables of $T$.

Substituting into (6.1) and parametrising additive quadruples as $h_1 = h$, $h_2 =
h + a + b$, $h_3 = h + a$, $h_4 = h + b$ we obtain

$\mathbb{E}_{n,h,a,b}\, e\bigl(T(h,n,n,n) + T(h+a+b, n-b, n-b, n-b)$
$\qquad - T(h+a,n,n,n) - T(h+b, n-b, n-b, n-b)\bigr) J_{\mathrm{LO}}(\cdot) \gg 1$.

If $T$ were genuinely quartilinear this would collapse (using the symmetry in the last
three variables) to give

(6.2)    $\mathbb{E}_{n,a,b}\, e(-3T(a,b,n,n))\, J_{\mathrm{LO}}(\cdot) \gg 1$,

where $J_{\mathrm{LO}}(\cdot)$ is only linear in $n$. Of course, $T$ is not genuinely quartilinear but rather
"bracket quartilinear". In practice this means that $T$ is quartilinear "up to lower
order terms", a phenomenon best understood, but perhaps harder to explain in a
brief overview, by thinking of $\chi(h,n)$ as a nilobject rather than as a bracket object.
After formalising this approximate quartilinearity, one may eventually assert, in
place of (6.2), a statement of the form

(6.3)    $\mathbb{E}_{n,a,b}\, e(-3T(a,b,n,n))\, J_{\mathrm{LO,CS}}(\cdot) \gg 1$,

where the subscript CS in $J_{\mathrm{LO,CS}}(\cdot)$ implies that this error term is not necessarily
of lower order (degree 1) in $n$, but the non-linear terms depend only on one of the
variables $a, b$ and will at some later point be removed using the Cauchy-Schwarz
inequality. Applying Cauchy-Schwarz to this yields

$\mathbb{E}_{n,a,b,b'}\, e(-3T(a,b,n,n) + 3T(a,b',n,n))\, J_{\mathrm{LO,CS}}(\cdot) \gg 1$.

Now the non-linear terms in $J_{\mathrm{LO,CS}}(\cdot)$ are independent of $a$, and depend only on
one of the variables $b, b'$. Substituting $c := a + b + b'$ gives

$\mathbb{E}_{n,c,b,b'}\, e(-3T(c-b-b',b,n,n) + 3T(c-b-b',b',n,n))\, J_{\mathrm{LO,CS}}(\cdot) \gg 1$.

In particular, there is some value of $c$ such that

$\mathbb{E}_{n,b,b'}\, e(-3T(c-b-b',b,n,n) + 3T(c-b-b',b',n,n))\, J_{\mathrm{LO,CS}}(\cdot) \gg 1$.

Using multilinearity (modulo lower order terms) and absorbing any terms depending
on only one of $b, b'$ into the junk term $J(\cdot)$ we obtain

$\mathbb{E}_{n,b,b'}\, e(3T(b',b,n,n) - 3T(b,b',n,n))\, J_{\mathrm{LO,CS}}(\cdot) \gg 1$.

At this point we have a statement that certainly seems to be asserting at least some 
kind of symmetry in the first two variables of T, which is of course our eventual 
goal. 
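
Both algebraic steps above can be made explicit in the idealised situation where $T$ is genuinely quartilinear and symmetric in its last three arguments. The sketch below works under exactly that simplifying assumption (the text stresses that in reality $T$ is only quartilinear up to lower order terms); the abbreviation $m := n - b$ is introduced here for brevity.

```latex
% (a) Collapse to (6.2).  Linearity in the first argument removes h:
\begin{align*}
&T(h,n,n,n) - T(h+a,n,n,n) + T(h+a+b,m,m,m) - T(h+b,m,m,m) \\
&\qquad = T(a,m,m,m) - T(a,n,n,n), \qquad m := n-b, \\
&\qquad = -3T(a,b,n,n) + 3T(a,b,b,n) - T(a,b,b,b).
\end{align*}
% The last two terms have degree at most 1 in n and are absorbed into J_{LO}.
%
% (b) After fixing c, expand in the first argument:
\begin{align*}
-3T(c-b-b',b,n,n) + 3T(c-b-b',b',n,n)
  &= 3T(b',b,n,n) - 3T(b,b',n,n) \\
  &\quad + \bigl[-3T(c,b,n,n) + 3T(b,b,n,n)\bigr] \\
  &\quad + \bigl[\,3T(c,b',n,n) - 3T(b',b',n,n)\bigr].
\end{align*}
% With c fixed, each bracketed group depends on only one of b, b',
% so it is absorbed into the junk term J_{LO,CS}.
```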

Further manipulations are required to turn it into something usable. Write

$\psi(b, b', n, n) := 3T(b', b, n, n) - 3T(b, b', n, n)$;

thus

(6.4)    $\mathbb{E}_{n,b,b'}\, e(\psi(b,b',n,n))\, J_{\mathrm{LO,CS}}(\cdot) \gg 1$.

The junk term $J(\cdot)$ is comprised of terms $J_{\mathrm{LO}}$ of lower order in $n$, and also of
terms $J_{\mathrm{CS}}$ depending on only one of the variables $b, b'$. By two applications of the
Cauchy-Schwarz inequality we may eliminate these latter terms, obtaining

$\mathbb{E}_{n,b_1,b_1',b_2,b_2'}\, e\bigl(\psi(b_1,b_1',n,n) - \psi(b_2,b_1',n,n)$
$\qquad - \psi(b_1,b_2',n,n) + \psi(b_2,b_2',n,n)\bigr) J_{\mathrm{LO}}(\cdot) \gg 1$,

where now the junk term $J_{\mathrm{LO}}$ consists only of terms that are of linear nature in $n$.
In particular, on average in $b_1, b_1', b_2, b_2'$, the Gowers $U^2$-norm of

$e(\psi(b_1,b_1',n,n) - \psi(b_2,b_1',n,n) - \psi(b_1,b_2',n,n) + \psi(b_2,b_2',n,n))$

is large. Writing this out in full and using the fact that $\psi$ is quartilinear up to lower
order terms implies that

$e(2\psi(b_1,b_1',h_1,h_2) - 2\psi(b_2,b_1',h_1,h_2) - 2\psi(b_1,b_2',h_1,h_2) + 2\psi(b_2,b_2',h_1,h_2))$

correlates with a lower-order object. By pigeonhole there is some choice of $b_2, b_2'$
such that the expectation over the remaining variables $h_1, h_2, b_1, b_1'$ is still $\gg 1$. For
these fixed $b_2, b_2'$ the terms involving $b_2, b_2'$ are of lower order in $b_1, b_1', h_1, h_2$, as a
result of which we conclude that

$e(2\psi(b, b', h_1, h_2)) = e(6T(b', b, h_1, h_2) - 6T(b, b', h_1, h_2))$

correlates with a lower-order object.
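
Two small computations lie behind the manipulations above. The sketch below treats the idealised case in which the junk terms $J_{\mathrm{CS}}$ are pure phases $e(\phi(b))$ or $e(\phi'(b'))$ and in which $\psi(b,b',\cdot,\cdot)$ is a genuinely symmetric bilinear form in its last two arguments; the names $\phi$, $\phi'$ and $Q$ are introduced here for illustration.

```latex
% (a) The alternating sum with signs (+,-,-,+) on the pairs
%     (b_1,b_1'), (b_2,b_1'), (b_1,b_2'), (b_2,b_2') kills one-variable phases:
\begin{align*}
\phi(b_1) - \phi(b_2) - \phi(b_1) + \phi(b_2) &= 0, \\
\phi'(b_1') - \phi'(b_1') - \phi'(b_2') + \phi'(b_2') &= 0 .
\end{align*}
% (b) For a symmetric bilinear form Q, the second difference in n of Q(n,n) is
\begin{align*}
Q(n+h_1+h_2,\,n+h_1+h_2) - Q(n+h_1,\,n+h_1) - Q(n+h_2,\,n+h_2) + Q(n,n)
  = 2\,Q(h_1,h_2),
\end{align*}
% which accounts for the factor 2 in front of \psi(\cdot,\cdot,h_1,h_2), and hence
% for the factor 6 = 2 \cdot 3 in front of T in the final display.
```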

This expresses a certain symmetry of $T(n_1, n_2, n_3, n_4)$ in the first two variables,
and this is enough to complete the "integration" of $\chi(h,n)$ and hence the proof of
the inverse conjecture GI(s).